# CatalystLLM 

Abstract: The discovery of novel catalytic materials is a cornerstone of chemical engineering and sustainable energy, yet it remains a complex, knowledge-intensive process. Large Language Models (LLMs) have shown remarkable potential in various scientific domains, but their application to catalysis is hindered by the lack of specialized, multi-dimensional benchmarks to guide their development and evaluation. To bridge this gap, we introduce CatalystBench, a comprehensive and challenging benchmark constructed from scientific literature and public datasets and designed to assess the capabilities of LLMs in catalyst design. Its tasks cover the entire closed loop of catalyst development, including reading comprehension, experimental analysis, and scheme reasoning. Based on this benchmark, we propose Multi-head Full-task (MFT) fine-tuning, a domain-specific method that couples task-specific output heads. We systematically compare it with three other fine-tuning strategies: Single-Task (ST), Full-Task (FT), and Multi-head Single-Task (MST). Extensive experiments show that the MFT strategy consistently achieves the largest performance improvements across all tasks, underscoring the effectiveness of explicit multi-task architectures in complex scientific reasoning. The resulting CatalystLLM significantly outperforms a wide array of state-of-the-art open-source and closed-source models on CatalystBench. We will publicly release both the CatalystBench benchmark and the CatalystLLM model, providing the community with a robust evaluation framework and a powerful new tool to accelerate AI-driven research in catalytic materials science.


- **Base Model:** [ChemLLM-7B](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat/tree/main)
- **Fine-tuned Model:** [CatalystLLM-7B-SFT]()
- **Dataset:** [CatalystData]()
- **SwanLab:** [CatalystLLM-sft]()
- **Fine-tuning Methods:** LoRA Fine-tuning
- **Inference Style:** R1 Reasoning Style
- **Hardware Requirements:**
  - **Full-parameter Fine-tuning:** 32GB VRAM
  - **LoRA Fine-tuning:** 28GB VRAM

## Environment Setup

```bash
pip install -r requirements.txt
```

## Data Preparation

The `data.py` script downloads the dataset, preprocesses it, and splits off a validation set. It generates `train.jsonl` and `val.jsonl`.

```bash
python data.py
```
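For readers who want to see what a train/validation split into JSONL files can look like, here is a minimal standard-library sketch. It is not the code in `data.py`: the helper names (`split_records`, `write_jsonl`, `read_jsonl`), the 10% default validation ratio, and the fixed seed are all assumptions made for illustration.

```python
import json
import random
from pathlib import Path


def split_records(records, val_ratio=0.1, seed=42):
    """Shuffle records deterministically and split them into (train, val).

    val_ratio and seed are illustrative defaults, not values taken from data.py.
    """
    if not 0.0 < val_ratio < 1.0:
        raise ValueError("val_ratio must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one validation example, even for very small datasets.
    n_val = max(1, int(len(shuffled) * val_ratio))
    return shuffled[n_val:], shuffled[:n_val]


def write_jsonl(path, records):
    """Write one JSON object per line, keeping non-ASCII characters readable."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `split_records(records)` and then `write_jsonl("train.jsonl", train)` and `write_jsonl("val.jsonl", val)` would produce files with the same names as those listed above. The record fields themselves depend on the dataset and are not specified here.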

## Training

### LoRA Fine-tuning

```bash
python FT_train_lora.py
```
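As background on why LoRA trains far fewer weights than full-parameter fine-tuning, the following back-of-the-envelope sketch counts trainable parameters for a single weight matrix. LoRA freezes the original `d_out x d_in` matrix and learns two low-rank factors, `A` (`rank x d_in`) and `B` (`d_out x rank`). The function names are hypothetical, and the dimensions and rank in the usage note are illustrative, not the settings used by `FT_train_lora.py`.

```python
def full_params(d_in, d_out):
    """Number of weights in a dense d_out x d_in matrix."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable weights LoRA adds for one matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank),
    so the total is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Ratio of LoRA-trainable weights to the frozen dense weights."""
    return lora_params(d_in, d_out, rank) / full_params(d_in, d_out)
```

For an illustrative 4096 x 4096 projection with rank 8, `lora_params` gives 65,536 trainable weights against 16,777,216 frozen ones, a fraction of 1/256. Actual memory use also depends on activations, optimizer state, and which modules receive adapters.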

## Inference

### LoRA Fine-tuning

```bash
python inference_lora.py
```
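R1-style models typically emit their reasoning before the final answer. The sketch below shows one way to separate the two when post-processing a completion. It assumes the reasoning is wrapped in `<think>...</think>` tags, as in DeepSeek-R1 outputs; whether `inference_lora.py` produces exactly this format is an assumption, and `split_reasoning` is a hypothetical helper, not part of this repository.

```python
import re

# Assumed delimiter convention for R1-style reasoning traces.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from an R1-style completion.

    If no <think>...</think> block is found, the reasoning is an empty
    string and the whole text is treated as the answer.
    """
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is taken as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer
```

Falling back to the full text when no tags are present keeps the helper usable on completions that skip the reasoning block.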

## Related Tools

- [SwanLab](https://github.com/SwanHubX/SwanLab): Open-source, modern deep learning experiment tracking and visualization platform
- [Transformers](https://github.com/huggingface/transformers): HuggingFace's library for state-of-the-art pretrained models
- [PEFT](https://github.com/huggingface/peft): Parameter-Efficient Fine-Tuning library for large language models